25 research outputs found

    Visual object tracking performance measures revisited

    Visual tracking evaluation features a wide variety of performance measures and suffers from a lack of consensus about which measures should be used in experiments. This makes cross-paper tracker comparison difficult. Furthermore, as some measures may be less effective than others, the tracking results may be skewed or biased towards particular tracking aspects. In this paper we revisit the popular performance measures and tracker performance visualizations and analyze them theoretically and experimentally. We show that several measures are equivalent in terms of the information they provide for tracker comparison and, crucially, that some are more brittle than others. Based on our analysis we narrow down the set of potential measures to only two complementary ones, describing accuracy and robustness, thus pushing towards homogenization of the tracker evaluation methodology. These two measures can be intuitively interpreted and visualized and have been employed by the recent Visual Object Tracking (VOT) challenges as the foundation for the evaluation methodology.
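    The two measures named in this abstract can be illustrated with a minimal sketch. It assumes that accuracy is the mean region overlap (intersection over union) on frames where the tracker still overlaps the target, and that robustness is the number of frames with zero overlap. The abstract does not give formulas, so these definitions, the (x, y, width, height) box format, and the function names are all illustrative assumptions. The sketch also omits the tracker re-initialization step that a full evaluation protocol may apply after a failure.

```python
from typing import List, Tuple

# Assumed box format (not taken from the abstract): (x, y, width, height).
Box = Tuple[float, float, float, float]


def overlap(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the intersection rectangle (zero if disjoint).
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def accuracy_and_robustness(
    predicted: List[Box], ground_truth: List[Box]
) -> Tuple[float, int]:
    """Return (mean overlap on non-failed frames, number of failed frames).

    Simplifying assumption: a frame counts as a failure when the overlap
    is exactly zero, and no re-initialization is simulated.
    """
    overlaps = [overlap(p, g) for p, g in zip(predicted, ground_truth)]
    failures = sum(1 for o in overlaps if o == 0.0)
    tracked = [o for o in overlaps if o > 0.0]
    accuracy = sum(tracked) / len(tracked) if tracked else 0.0
    return accuracy, failures


if __name__ == "__main__":
    gt = [(0, 0, 10, 10)] * 3
    pred = [(0, 0, 10, 10), (5, 0, 10, 10), (100, 100, 10, 10)]
    print(accuracy_and_robustness(pred, gt))
```

    Reporting the two numbers separately, rather than folding them into one score, mirrors the abstract's point that the measures are complementary: a tracker can overlap the target well while it tracks and still lose it often.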

    Beyond standard benchmarks: Parameterizing performance evaluation in visual object tracking

    Object-to-camera motion produces a variety of apparent motion patterns that significantly affect the performance of short-term visual trackers. Despite being crucial for designing robust trackers, their influence is poorly explored in standard benchmarks due to weakly defined, biased and overlapping attribute annotations. In this paper we propose to go beyond pre-recorded benchmarks with post-hoc annotations by presenting an approach that utilizes omnidirectional videos to generate realistic, consistently annotated, short-term tracking scenarios with exactly parameterized motion patterns. We have created an evaluation system and constructed a fully annotated dataset of omnidirectional videos together with generators for typical motion patterns. We provide an in-depth analysis of major tracking paradigms which is complementary to the standard benchmarks and confirms the expressiveness of our evaluation approach.

    A hierarchical adaptive model for robust short-term visual tracking

    Visual tracking is a topic in computer vision with applications in many emerging as well as established technological areas, such as robotics, video surveillance, human-computer interaction, autonomous vehicles, and sport analytics. The main question of visual tracking is how to design an algorithm (visual tracker) that determines the state of one or more objects in a stream of images by accounting for their sequential nature. In this doctoral thesis we address two important topics in single-target short-term visual tracking. The first topic is related to the construction of an object appearance model for visual tracking. The modeling and updating of the appearance model is crucial for successful tracking. We introduce a hierarchical appearance model which structures object appearance in multiple layers. The bottom layer contains the most specific information and each higher layer models the appearance information in a more general way. The hierarchical relations are also reflected in the update process, where the higher layers guide the lower layers in their update while the lower layers provide a source for adaptation to higher layers if their information is reliable. The benefits of hierarchical appearance models are demonstrated with two implementations, primarily designed to tackle tracking of non-rigid and articulated objects that present a challenge for many existing trackers. The first appearance model combines local and global visual information in a coupled-layer appearance model. The bottom layer contains a part-based appearance description that is able to adapt to the geometrical deformations of non-rigid targets, and the top layer is a multi-modal global object appearance model that guides the model during object appearance changes. The experimental evaluation shows that the proposed coupled-layer appearance model excels in robustness despite the fact that it uses relatively simple appearance descriptors.
    Our evaluation also exposed several weaknesses that were reflected in a decreased accuracy. Our second appearance model extends the hierarchy by introducing a third layer and the concept of template anchors. The first two layers are conceptually similar to the original two-layer appearance model, while the third layer is a memory system composed of static templates that provide a strong spatial cue when one of the templates is matched to the image reliably, thus assisting in quick recovery of the entire appearance model. In the experimental evaluation we show that this addition indeed improves the accuracy, as well as the overall performance of a tracker. The second topic that we address is the performance evaluation of single-target short-term visual tracking algorithms. In contrast to the dominant trend in the past decades, we claim that visual tracking is a complex process and that the performance of visual trackers cannot be reduced to a single performance measure, nor should it be described by an arbitrary set of measures where the relationship between measures is not well understood. In our research we investigate performance measures that are traditionally used in performance evaluation of single-target short-term visual trackers, through theoretical and empirical analysis, and show that some of them measure the same aspect of tracking performance. Based on our analysis we propose a pair of weakly correlated measures to measure the accuracy and robustness of a tracker, propose a visualization of the results, and analyze the entire methodology using theoretical trackers that exhibit extreme tracking behaviors. This is followed by an extension of the methodology to the ranking of multiple trackers, where we also take into account the potentially stochastic nature of visual trackers and test the statistical significance of performance differences.
    To support the proposed evaluation methodology we have developed an open-source software tool that implements the methodology and a simple communication protocol that enables a straightforward integration of trackers. The proposed evaluation methodology and the evaluation system have been adopted by several Visual Object Tracking (VOT) challenges.

    Robust visual tracking using template anchors


    The Ninth Visual Object Tracking VOT2021 Challenge Results

    Accepted version. Peer reviewed.

    Visual tracking of non-rigid objects

    In this thesis we study visual tracking of non-rigid, articulated objects. For this task, a typical visual model, used mostly for the description of rigid objects, has to be extended so that it can adapt to the deformations of such objects. We present an extension that is based on a hierarchical combination of local and global visual information. The resulting visual model extends the existing visual models that use a set of local features connected with geometrical constraints. This set represents the bottom layer of the presented visual model. Using local features, the visual model builds a multi-modal representation of the object’s appearance that represents the top layer of the model. Based on this information, the area of the object in a frame is determined, and based on this area, the local feature set is updated with new features. In the thesis our work is first placed into a research context by describing recently published work on visual tracking of non-rigid objects. Next, the proposed visual model is described in detail together with its integration in a simple tracker. The performance of the tracker is assessed in various experiments using nine different video sequences. Advantages and disadvantages of the tracker are shown by comparing it with three different state-of-the-art visual trackers. The thesis concludes with a discussion in which some theoretical and practical limitations of the presented visual model are laid out, as well as some ideas for further development.